Learning with Lq<1 vs L1-Norm Regularisation with Exponentially Many Irrelevant Features
Authors
Abstract
We study the use of fractional norms for regularisation in supervised learning from high-dimensional data, under conditions of a large number of irrelevant features, focusing on logistic regression. We develop a variational method for parameter estimation, and show an equivalence between two approximations recently proposed in the statistics literature. Building on previous work by A. Ng, we show that fractional-norm regularised logistic regression enjoys a sample complexity that grows logarithmically with the data dimension and polynomially with the number of relevant dimensions. In addition, extensive empirical testing indicates that fractional-norm regularisation is more suitable than L1 in cases where the number of relevant features is very small, and works very well despite a large number of irrelevant features.

1 Lq<1-Regularised Logistic Regression

Consider a training set of pairs z = {(x_j, y_j)}_{j=1}^n drawn i.i.d. from some unknown distribution P. The x_j ∈ ℝ^m are m-dimensional input points and the y_j ∈ {−1, 1} are the associated target labels for these points. Given z, the aim in supervised learning is to learn a mapping from inputs to targets that can then predict the target values for previously unseen points that follow the same distribution as the training data. We are interested in problems with a large number m of input features, of which only a few r ≪ m are relevant to the target. In particular, we focus on a form of regularised logistic regression for this purpose:
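The objective introduced by the colon above is not reproduced in this extract. As a hedged reconstruction only, assuming the standard form of Lq-penalised logistic regression with weight vector θ and a regularisation parameter λ (notation assumed here, not taken from the paper), the objective would read:

% Plausible form of the Lq<1-regularised logistic regression objective; the
% regularisation weight \lambda and the q-th power of the quasi-norm are
% assumptions, not taken verbatim from the paper.
\hat{\theta} = \arg\min_{\theta \in \mathbb{R}^m}
    \sum_{j=1}^{n} \log\!\left( 1 + \exp\!\left( -y_j\, \theta^{\top} x_j \right) \right)
    + \lambda \sum_{i=1}^{m} |\theta_i|^{q},
\qquad 0 < q < 1 .

The paper's own parameter estimation is variational; the sketch below is not that method. It is only a minimal numpy baseline that minimises a smoothed surrogate of the objective above by gradient descent, where the smoothing constant eps and all function and variable names are illustrative assumptions.

import numpy as np

def fit_lq_logistic(X, y, q=0.5, lam=1.0, eps=1e-8, lr=0.01, n_iter=5000):
    """X: (n, m) inputs; y: (n,) labels in {-1, +1}. Returns the weight vector theta."""
    n, m = X.shape
    theta = np.zeros(m)
    for _ in range(n_iter):
        margins = np.clip(y * (X @ theta), -30.0, 30.0)   # y_j * theta^T x_j, clipped for numerical stability
        sigma = 1.0 / (1.0 + np.exp(margins))              # derivative factor of log(1 + exp(-margin))
        grad_loss = -(X.T @ (y * sigma))                   # gradient of the logistic loss term
        # Gradient of the smoothed penalty lam * sum_i (theta_i^2 + eps)^(q/2),
        # which approximates lam * sum_i |theta_i|^q while staying differentiable at 0.
        grad_pen = lam * q * theta * (theta ** 2 + eps) ** (q / 2.0 - 1.0)
        theta -= lr * (grad_loss + grad_pen)
    return theta

# Toy usage: r = 3 relevant features out of m = 50 (synthetic data, for illustration only).
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 50))
w_true = np.zeros(50)
w_true[:3] = [2.0, -1.5, 1.0]
y = np.sign(X @ w_true + 0.1 * rng.standard_normal(100))
theta_hat = fit_lq_logistic(X, y, q=0.5, lam=2.0)
print("largest-magnitude coordinates:", np.argsort(-np.abs(theta_hat))[:5])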
Similar papers
Efficient L1/Lq Norm Regularization
Sparse learning has recently received increasing attention in many areas including machine learning, statistics, and applied mathematics. The mixed-norm regularization based on the l1/lq norm with q > 1 is attractive in many applications of regression and classification in that it facilitates group sparsity in the model. The resulting optimization problem is, however, challenging to solve due t...
Feature selection, L1 vs. L2 regularization, and rotational invariance
We consider supervised learning in the presence of very many irrelevant features, and study two different regularization methods for preventing overfitting. Focusing on logistic regression, we show that using L1 regularization of the parameters, the sample complexity (i.e., the number of training examples required to learn “well,”) grows only logarithmically in the number of irrelevant features...
Efficient Mixed-Norm Regularization: Algorithms and Safe Screening Methods
Sparse learning has recently received increasing attention in many areas including machine learning, statistics, and applied mathematics. The mixed-norm regularization based on the l1/lq norm with q > 1 is attractive in many applications of regression and classification in that it facilitates group sparsity in the model. The resulting optimization problem is, however, challenging to solve due t...
Learning Robust Graph Regularisation for Subspace Clustering
Various subspace clustering methods have benefited from introducing a graph regularisation term in their objective functions. In this work, we identify two critical limitations of the graph regularisation term employed in existing subspace clustering models and provide solutions for both of them. First, the squared l2-norm used in the existing term is replaced by a l1-norm term to make the regu...
A Dirty Model for Multi-task Learning
We consider multi-task learning in the setting of multiple linear regression, and where some relevant features could be shared across the tasks. Recent research has studied the use of l1/lq norm block-regularizations with q > 1 for such blocksparse structured problems, establishing strong guarantees on recovery even under high-dimensional scaling where the number of features scale with the numb...
Publication date: 2008